Lab book for analyses using hierarchical computational modelling to identify parameters that define the best model of learning as it applies to fear conditioning acquisition and extinction, using FLARe fear conditioning data. Long abstract, justification and analysis plan found in prelim manuscript here
In short:
Cross validate best fitting model in TEDS data
Evidence from both human (Richter et al., 2012) and rodent (Galatzer-Levy, Bonanno, Bush, & LeDoux, 2013) studies suggests that trajectories of how we learn and extinguish fear differ between individuals. Different trajectories of fear and extinction have also been found using fear conditioning studies (e.g. Duits et al., 2016), a good model for the learning of, and treatment for, fear and anxiety disorders. It is likely that these trajectories of fear extinction might predict outcomes in exposure-based cognitive behavioural therapy (Kindt, 2014).
Identifying parameters that predict individual trajectories of fear learning and extinction will enable us to harness fear conditioning data more effectively to aid in understanding mechanisms underlying the development of and treatment for anxiety disorders. With more accurate models of these processes, the potential to use fear conditioning paradigms to predict those most at risk of developing an anxiety disorder, and those who might respond best to exposure-based treatments, greatly improves.
Sutton and Barto Reinforcement Learning - Textbook on reinforcement learning
Anxiety promotes memory for mood-congruent faces but does not alter loss aversion (Charpentier…Robinson, 2015) - Good example of a sensitivity learning parameter
Hypotheses About the Relationship of Cognition With Psychopathology Should be Tested by Embedding Them Into Empirical Priors (Moutoussis et al., 2018) - Including variables of interest (e.g. anxiety) in the model
Toby Wise has just submitted an aversive learning paper incorporating beta probability distributions in the best model for uncertain learning parameters etc.
A copy of this is
Select best fitting model
Will use a combination of R (Version 3.5.1), RStan (Version 2.18.2, GitRev: 2e1f913d3ca3) and the hBayesDM package in R (Ahn, W.-Y., Haines, N., & Zhang, L. (2017). Revealing neuro-computational mechanisms of reinforcement learning and decision-making with the hBayesDM package. Computational Psychiatry, 1, 24-57.), which uses RStan.
Discussion with Vince Valton and Alex Pike about the best way to fit this model. As the observed outcomes (expectancy ratings) are non-binary and are related to each other (i.e. as you become more likely to select 9, you become less likely to select 1), we should consider each trial for each person for each stimulus as a constantly updating beta distribution. So you might see a pattern like this for the CS+ in acquisition, for example.
So, best model is likely to be one using beta distributions that show the probability distribution for each rating.
We can use sufficient parameters to describe these (i.e. mean / sd or possibly the mode)
scaling
We can scale the beta by how aversive participants find the shock, i.e. it might update their learning as if there were 0.5 of a shock or 1.5 of a shock, depending on their own sensitivity to the aversiveness / punishment.
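As a minimal sketch of what such scaling could look like, the outcome can be multiplied by a per-person sensitivity before the prediction error is computed. The function name `update_value` and the parameter name `rho` are hypothetical and are not taken from the project's Stan scripts.

```r
# Sketch only: value update with a per-person outcome scaling (rho).
# update_value and rho are illustrative names, not from the project scripts.
update_value <- function(v, outcome, alpha, rho = 1) {
  # the prediction error is computed against the scaled outcome
  v + alpha * (rho * outcome - v)
}

v0 <- 0.5
update_value(v0, outcome = 1, alpha = 0.2, rho = 0.5)  # treated as half a shock
update_value(v0, outcome = 1, alpha = 0.2, rho = 1.5)  # treated as 1.5 shocks
```

With rho above 1 the value can drift above 1 over repeated reinforced trials, so it would need bounding before being used as the mean of a beta distribution.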
alpha
generalisation
We can do this with a single beta distribution for each phase (collapsing over the two stimuli). This would be akin to a per phase generalisation parameter in that it will be smaller if they tend to choose the same expectancy for both stimuli and larger if they tend to choose very differently for both stimuli.
However, these variables are not really equivalent (i.e. the reinforcement rate is different for both, and we use this in the model).
So instead we can create a parameter which is the value of the CS- weighted by some value of the CS+. How much each individual weights by the CS+ can be freely estimated by the model and can be the generalisation parameter.
So this would be vminus = vminus + (w)vplus (where the w parameter is the freely estimated parameter per person)
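The weighting can be sketched in a couple of lines of R. The values below are made up for illustration and the function name `generalise` is hypothetical; in the model itself w would be estimated per person.

```r
# Sketch only: CS- value weighted by the CS+ value.
# w stands in for the freely estimated per-person generalisation weight.
generalise <- function(v_minus, v_plus, w) {
  v_minus + w * v_plus
}

generalise(v_minus = 0.1, v_plus = 0.8, w = 0)    # no generalisation: CS- value unchanged
generalise(v_minus = 0.1, v_plus = 0.8, w = 0.5)  # half of the CS+ value carries over
```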
per stimulus
We probably want to model cs+ and cs- separately too - so have a beta distribution characterised by sufficient parameters for each.
per trial
All of the above can then also be done with updating per trial.
leaky beta
We also need a model that incorporates ‘leak’, i.e. learning leak: it is likely that participants will update more based on recent trials and learn less from more distant trials as time progresses. See Toby’s paper for more.
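One possible way to express leak, shown here purely as an assumption and not necessarily the form used in Toby's paper, is to let the beta pseudo-counts decay by a retention factor on every trial. The name `leaky_beta` and the parameter `lambda` are hypothetical.

```r
# Sketch only: beta pseudo-counts that decay ("leak") on every trial.
# lambda (0-1) is a hypothetical retention parameter; lambda = 1 means no leak.
leaky_beta <- function(outcomes, lambda, a0 = 1, b0 = 1) {
  a <- a0
  b <- b0
  for (o in outcomes) {
    a <- lambda * a + o        # evidence that the outcome occurs
    b <- lambda * b + (1 - o)  # evidence that the outcome does not occur
  }
  c(a = a, b = b, mean = a / (a + b))
}

outcomes <- c(1, 1, 1, 1, 0, 0, 0, 0)  # early screams, then none
leaky_beta(outcomes, lambda = 1)       # all trials weighted equally
leaky_beta(outcomes, lambda = 0.5)     # recent trials dominate
```

With no leak the mean sits at 0.5 for this sequence; with leak the recent unreinforced trials pull the mean well below 0.5.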
uncertainty
We should consider incorporating a parameter that maps to participant uncertainty about outcomes.
anxiety
Might be worth incorporating this as a model parameter / feature. Read this for more.
V == ‘value’. Basically a parameter that is about the salience of the stimulus at any given point.
alpha == ‘learning rate’. A parameter that describes how sensitive people are to updating their learning. So a fast learning rate means that learning on any given trial is weighted more towards the immediately preceding trials than past ones, and a slow learning rate means that all past events influence learning more evenly. Alex’s tennis analogy is good here (Federer - stable player, can predict a win based on all matches; Murray - volatile player; his last match is best predictor of next match performance).
beta == ‘confidence’. This is sort of an error term: how much variance in rating choice there is for each person/trial. Can be thought of as the sd, with beta^2 as the variance.
Can be confusing as we are also using beta distributions (a different thing), which have two sufficient parameters (a and b).
and how they change depending on whether you change the beta or alpha parameters.
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[1] 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70 0.75 0.80 0.85 0.90 0.95 1.00
[1] "stable beta, increasing alpha"
[1] "stable alpha, increasing beta"
Will probably do all per trial. Will do an early sensitivity check to confirm this.
Alpha Learning rate parameter. If high then will be very influenced by previous trial events; if low, then will be more evenly influenced by accumulating events.
Betas Variance/certainty parameter
These use Alex Pike’s RStan script with minor modification to make it punishment only to see if it runs. Testing that the approach works with the current data set up etc.
The settings for the script are below, including stan chain parameters and directory set up.
This loads the libraries and source files needed to run this script, and sets up RStan.
Only doing this ‘accurately’ for the acquisition CS+, as the simulations require a probability. I am using the contingency for this (0.75). If set to 0 for all other phases and stimuli then it looks as if the learning should be flat regardless of alpha. We expect in reality that this probability will vary between people and will be unlikely to be zero. So test 12 and 18 trials with a probability of 0.5 and 0.2 as well.
[1] "Simulated learning rates. 12 trials; probability = 0.75 (CSp acq contingency) \n"
[1] "Simulated learning rates. 12 trials; Probability = 0.5\n"
[1] "Simulated learning rates. 12 trials; Probability = 0.2\n"
[1] "Simulated learning rates. 18 trials; Probability = 0.5\n"
[1] "Simulated learning rates. 18 trials; Probability = 0.2\n"
See if the basic punishment only learning model for the CS+ and CS- works with the FLARe master data
From the rstan github
This is to check that all is compiling and working and to give an idea of data format etc.
Load in the week 1 app and lab data for FLARe pilot, TRT and headphones studies. Make it long form.
Try with acquisition data first. This is formatted with no column names, with no missing data.
Derive the n parameter for both files and check these match
stanname='punish_only.stan'
minus_name <- 'bayes_acq_minus.csv'
plus_name <- "bayes_acq_plus.csv"
stanfile <- file.path(scriptdir, stanname)
minusfile <- file.path(datadir,minus_name)
plusfile <- file.path(datadir,plus_name)
minus <- fread(minusfile,data.table=F)
plus <-fread(plusfile,data.table=F)
nacqm <- dim(minus)[1]
nacqp <- dim(plus)[1]
## check that these match and create nsub variable for RStan
if (nacqm == nacqp) {
print('subject number match')
nsub <- nacqm
print(paste('nsub set to',nsub,sep=" "))
} else {
print('WARNING: subject number does not match. Check master dataset')
}
[1] "subject number match"
[1] "nsub set to 342"
# check the file format is ok
minus[1:2,]
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12
1 5 4 3 1 2 2 1 2 3 2 3 2
2 8 8 1 5 4 3 2 1 1 1 1 1
plus[1:2,]
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12
1 5 6 7 4 7 7 6 7 2 8 7 8
2 1 9 9 5 8 8 9 9 9 8 9 8
The expectancy rating datasets look like they are formatted fine and ntrials and nsub variables should exist.
Need to go back to stage zero and keep scream yes/no as a variable. For now to see if this runs create simulated version for the CS+. CS- will remain the same.
screamMinus <- matrix(0L,nrow=nsub, ncol=ntrials)
# Initialise plus dataset in the same way, but make the first trial 1 for everyone, then add 8 additional random 1's per person. Do this in four random patterns to mimic the real data
sc1 <- c(1,1,0,1,0,0,1,1,1,1,1)
sc2 <- c(0,1,1,1,0,0,1,1,1,1,1)
sc3 <- c(1,1,1,0,1,0,1,0,1,1,1)
sc4 <- c(1,0,1,1,0,0,1,1,1,1,1)
screamPlus <- matrix(0L,nrow=nsub, ncol=ntrials)
screamPlus[,1] <- 1
# for (n in 1:dim(screamPlus)[1]) {
# print(n)
# screamPlus[n,2:12] <- sample(patts,1,replace=T)
# }
for (n in 1:dim(screamPlus)[1]) {
a <- sample(1:4,1) # pick one of the four patterns at random
if (a == 1) {
screamPlus[n,2:12] <- sc1
} else if (a == 2) {
screamPlus[n,2:12] <- sc2
} else if (a == 3){
screamPlus[n,2:12] <- sc3
} else {
screamPlus[n,2:12] <- sc4
}
}
For now, to see if stan runs using the bernoulli-logit function, make binary responses from expectancy, i.e. >= 4.5 == 1, < 4.5 == 0.
binarise <- function(x) {
ifelse(x >= 4.5,1,0)
}
minusb <- data.frame(apply(minus,2,function(x) binarise(x)))
plusb <- data.frame(apply(plus,2,function(x) binarise(x)))
This directs to my local machine here /Users/kirstin/Dropbox/SGDP/FLARe/FLARe_MASTER/Projects/Hierachal_modelling/Scripts and is remotely linked to the github repository here.
git pull Bayes_modelling
Unhash this if you want to check what the model looks like within the notebook.
stanname="punish_only.stan"
scriptdir="/Users/kirstin/Dropbox/SGDP/FLARe/FLARe_MASTER/Projects/Hierachal_modelling/Scripts"
#cat $scriptdir/$stanname
use echo to push these to the new file if you want to make changes from here.
## initialise bash directory and filename
stanname="punish_only.stan"
scriptdir="/Users/kirstin/Dropbox/SGDP/FLARe/FLARe_MASTER/Projects/Hierachal_modelling/Scripts"
#echo "<any changes here>" > $scriptdir/$stanname
unhash this to run experimental script that checked if stan runs. This was mostly to check data formatting and installation / compilation etc.
flare_data<-list(ntrials=ntrials,nsub=nsub,includeTrial = rep(1,ntrials), screamPlus=t(screamPlus),screamMinus=t(screamMinus),
ratingPlus=t(plusb),ratingMinus=t(minusb))
#flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
#save(flare_fit, file=file.path(datadir,'flare_fit_test'))
#traceplot(flare_fit,'lp__')
# extract fit data
#summary_flare<- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
view the fit information
#summary_flare
extract the loglikelihood using loo
#loo(flare_loglike)
so, good news is this all works. So preliminary check a success. Next need to consider the appropriate model.
We need to rescale our dataset here to be between 0 and 1.
Importantly, because we are using the proportion of trials that are not reinforced as a known parameter for statistical reasons (we don’t want a proportion of .75 and 1, better to have .25 and 0), we have made our rescaled expectancy values as 1 - rescaled(x). This means that we will still be able to interpret the results in the expected way (i.e. higher rating is greater expectation of the outcome).
rescale the 1-9 expectancy values to be on a 0-1 scale.
stan cannot deal with the extreme limit of the beta, so make the rescaled limits just above 0 and below one
library(scales)
# rescale and flip so that we are effectively rating the expectation that they WILL NOT hear a scream to match stan
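# fix the input range at 1-9 so every trial column is rescaled identically
minus_scaled <- data.frame(apply(minus,2,function(x) 1-rescale(x, to=c(0.00001,0.99999), from=c(1,9))))
plus_scaled <- data.frame(apply(plus,2,function(x) 1-rescale(x, to=c(0.00001,0.99999), from=c(1,9))))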
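`rescale()` from the scales package uses the observed range of its input unless `from` is supplied, so a trial column that does not contain both a 1 and a 9 would be stretched differently from the others. As a sketch of the intended mapping with the 1-9 range fixed, here is a base-R equivalent; the function name `flip_rescale` is hypothetical.

```r
# Sketch only: base-R version of the flip-and-rescale step with the 1-9 range fixed.
flip_rescale <- function(x, lo = 1, hi = 9, eps = 0.00001) {
  scaled <- eps + (x - lo) / (hi - lo) * (1 - 2 * eps)  # maps lo..hi onto eps..(1 - eps)
  1 - scaled                                            # flip: high rating -> low value
}

flip_rescale(c(1, 5, 9))
```

A rating of 1 maps to 0.99999, 5 maps to 0.5 and 9 maps to 0.00001, so the extremes stay just inside the open interval that the beta likelihood needs.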
This is a vector containing the absolute number of trials where no scream occurred for each stimulus. As there was a 75% reinforcement rate for the CS+ (9/12 trials), this is a vector of ’3’s. For the CS-, no trials were reinforced so is a vector of ’12’s
No_scream_p <- rep(3,nsub)
No_scream_m <- rep(12,nsub)
Create datasets for the acquisition CS- and extinction CS+ and CS- reflecting that no screams occurred at all. Then use the pattern id variable to create a dataset for the acquisition CS+ indicating when a scream occurred for each participant.
## Create the no scream datasets for all
screamMinus <- matrix(0L,nrow=nsub, ncol=ntrials)
# Initialise plus dataset in the same way, but make the first trial 1 for everyone, then add 8 additional random 1's per person. Do this in four random patterns to mimic the real data
sc1 <- c(1,1,0,1,0,0,1,1,1,1,1)
sc2 <- c(0,1,1,1,0,0,1,1,1,1,1)
sc3 <- c(1,1,1,0,1,0,1,0,1,1,1)
sc4 <- c(1,0,1,1,0,0,1,1,1,1,1)
screamPlus <- matrix(0L,nrow=nsub, ncol=ntrials)
screamPlus[,1] <- 1
# for (n in 1:dim(screamPlus)[1]) {
# print(n)
# screamPlus[n,2:12] <- sample(patts,1,replace=T)
# }
for (n in 1:dim(screamPlus)[1]) {
a <- sample(1:4,1) # pick one of the four patterns at random
if (a == 1) {
screamPlus[n,2:12] <- sc1
} else if (a == 2) {
screamPlus[n,2:12] <- sc2
} else if (a == 3){
screamPlus[n,2:12] <- sc3
} else {
screamPlus[n,2:12] <- sc4
}
}
Because we use the 1-rescaled expectancy data, no need to try and invert to reinforcement parameters here. As a result we need the stan model to simply be:
alphaPlus[p] = nothingPlus[p]/ntrials;
alphaMinus[p] = nothingMinus[p]/ntrials;
Here we try to estimate the alpha parameter of the beta distribution per trial per person per stimulus (i.e. you have two sufficient parameters for each beta dist, the alpha and beta; we want to estimate the alpha).
Eventually we will scale these by the actual ‘value’ of the scream for each person per trial.
Using data loaded in from preliminary tests above.
so this is a beta value per person (assuming the underlying process for the plus and minus are the same)
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_noscaling.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,nothingPlus = No_scream_p, nothingMinus=No_scream_m,ratingsPlus=plus_scaled,ratingsMinus=minus_scaled)
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
save(flare_fit, file=file.path(datadir,'flare_fit_test'))
traceplot(flare_fit,'lp__')
# extract fit data
summary_flare<- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## get some basic output descriptions printed to screen
out_describe(summary_flare)
Simple alteration of the first model. We estimate a scaling parameter per person over all trials and apply this to alpha component per participant.
Here we try to estimate the alpha parameter of the beta distribution per trial per person per stimulus (i.e. you have two sufficient parameters for each beta dist, the alpha and beta; we want to estimate the alpha).
Eventually we will scale these by the actual ‘value’ of the scream for each person per trial.
Using data loaded in from preliminary tests above.
so this is a beta value per person (assuming the underlying process for the plus and minus are the same)
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_scaling.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,nothingPlus = No_scream_p, nothingMinus=No_scream_m,ratingsPlus=plus_scaled,ratingsMinus=minus_scaled)
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
save(flare_fit, file=file.path(datadir,'flare_fit_test'))
traceplot(flare_fit,'lp__')
# extract fit data
summary_flare<- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## get some basic output descriptions printed to screen
out_describe(summary_flare)
Here we try to estimate the alpha parameter of the beta distribution per trial per person per stimulus (i.e. you have two sufficient parameters for each beta dist, the alpha and beta; we want to estimate the alpha).
Eventually we will scale these by the actual ‘value’ of the scream for each person per trial.
Using data loaded in from preliminary tests above.
so this is a beta value per person (assuming the underlying process for the plus and minus are the same)
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_withRL.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
save(flare_fit, file=file.path(datadir,'flare_fit_test'))
traceplot(flare_fit,'lp__')
# extract fit data
summary_flare <- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## get some basic output descriptions printed to screen
out_describe(summary_flare)
This model includes an alpha learning parameter per person, estimating their learning rate and updating based on it. This model needs a dataset that indicates whether a scream occurred for each trial instead of the proportion of times no scream occurred.
Alex used this stack post to help solve the shape parameters using mean and sd, where we assume that v serves as the mean and beta as the sd.
The equations work out to this:
for shape 1:
\[\alpha = \left(\frac{1-\mu}{\sigma^2} - \frac{1}{\mu}\right)\mu^2\]
for shape 2:
\[\beta=\alpha \left(\frac{1}{\mu}-1\right)\]
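The two equations above can be checked numerically with a short sketch. The function name `beta_shapes_meansd` is hypothetical and the input values are illustrative.

```r
# Sketch only: beta shape parameters from a mean (mu) and sd (sigma).
beta_shapes_meansd <- function(mu, sigma) {
  shape1 <- ((1 - mu) / sigma^2 - 1 / mu) * mu^2
  shape2 <- shape1 * (1 / mu - 1)
  c(shape1 = shape1, shape2 = shape2)
}

beta_shapes_meansd(mu = 0.5, sigma = 0.1)  # both shapes equal 12
beta_shapes_meansd(mu = 0.5, sigma = 0.6)  # negative shapes: sigma^2 >= mu * (1 - mu)
```

The shapes are only positive when sigma^2 < mu(1 - mu). This may be related to why the sd parameter has to be tightly bounded for the sampler to start, although that has not been checked against the Stan script here.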
## get some basic output descriptions printed to screen
out_describe(summary_flare)
[1] "1000 iterations on 1 chains. "
[1] " "
[1] "Alpha descriptives"
vars n mean sd median trimmed mad min max range skew kurtosis se
X1 1 342 0.21 0.08 0.2 0.2 0.04 0 0.58 0.58 1.27 4.41 0
[1] " "
[1] "Combined beta descriptives"
vars n mean sd median trimmed mad min max range skew kurtosis se
X1 1 343 0 0.02 0 0 0 0 0.27 0.27 13.27 177.55 0
[1] " "
[1] "Average Rhat"
[1] 0.9995165
On 500 iterations (i.e. test) the variance in alpha is good, but the traceplot is terrible. Model converges very poorly. We also have to constrain the beta to be between 0 and 0.0001. Not sure why this is.
when running for 2000 iterations (1000 warmup)…
This results in the following warning;
There were 2644 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
There were 4 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
There were 4 chains where the estimated Bayesian Fraction of Missing Information was low. See http://mc-stan.org/misc/warnings.html#bfmi-low
Examine the pairs() plot to diagnose sampling problems
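The settings named in these warnings are passed to rstan through the `control` argument of `stan()`. A minimal sketch follows; the specific values (0.95 and 15) are illustrative assumptions, not settings that have been tested on this model.

```r
# Sketch only: sampler settings suggested by the warnings; values are illustrative.
sampler_control <- list(
  adapt_delta = 0.95,   # above the 0.8 default, to reduce divergent transitions
  max_treedepth = 15    # above the default of 10
)

# Passed to rstan via the control argument (left commented so this runs without the model):
# flare_fit <- stan(file = stanfile, data = flare_data, iter = chain_iter,
#                   chains = chain_n, control = sampler_control)
str(sampler_control)
```

With this many divergent transitions, raising adapt_delta alone may not be enough and the parameterisation itself probably needs revisiting.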
The above mean definition does not map the data well (terrible traceplot!). I found this from the MRC BSU and have tried defining the beta parameters assuming V == mean in a slightly different way:
for parameter a:
\[\alpha = \mu\beta/(1-\mu)\]
for parameter b:
\[\beta = \mu(1-\mu)^2/\sigma+\mu-1\]
Still using a single beta here.
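As a quick numerical check, assuming the spread term denotes the variance in both sets of definitions, the mean/sd equations and the alternative equations for a and b give the same shape parameters. The function names and input values below are hypothetical.

```r
# Sketch only: both sets of definitions give the same shapes when the
# spread term is the variance (v) in each.
shapes_first <- function(mu, v) {
  a <- ((1 - mu) / v - 1 / mu) * mu^2
  b <- a * (1 / mu - 1)
  c(a = a, b = b)
}
shapes_second <- function(mu, v) {
  b <- mu * (1 - mu)^2 / v + mu - 1
  a <- mu * b / (1 - mu)
  c(a = a, b = b)
}

shapes_first(mu = 0.3, v = 0.01)   # a = 6, b = 14
shapes_second(mu = 0.3, v = 0.01)  # a = 6, b = 14
```

If the two are algebraically the same, any difference in sampling behaviour would have to come from how the spread parameter is defined or bounded in the Stan scripts, which are not shown here.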
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_meansd_RL_2.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
save(flare_fit, file=file.path(datadir,'flare_fit_test'))
traceplot(flare_fit,'lp__')
# extract fit data
summary_flare<- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## get some basic output descriptions printed to screen
out_describe(summary_flare)
Noted that the shape parameters have slight variations in definition according to the discussion here. Updated the script slightly to reflect this, based on the reply from ocram.
The squared sd term in shape a is replaced by a single variance term, so it changes from:
\[\alpha = \left(\frac{1-\mu}{\sigma^2} - \frac{1}{\mu}\right)\mu^2\]
to
\[\alpha = \left(\frac{1-\mu}{\sigma} - \frac{1}{\mu}\right)\mu^2\]
This changes the shape 2 parameter definition from:
\[\beta=\alpha \left(\frac{1}{\mu}-1\right)\]
to
\[\beta = \left(\frac{1-\mu}{\sigma} - \frac{1}{\mu}\right)\mu\left(1-\mu\right)\]
## decide testing rate (min,med,max or off)
testing('med')
## set up run
stanname='beta_meansd_RL_3.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
hash mismatch so recompiling; make sure Stan code ends with a blank line
SAMPLING FOR MODEL 'beta_meansd_RL_3' NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 0.01459 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 145.9 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1:
Chain 1:
Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## get some basic output descriptions printed to screen
out_describe(summary_flare)
[1] "1000 iterations on 1 chains. "
[1] " "
[1] "Alpha descriptives"
vars n mean sd median trimmed mad min max range skew kurtosis se
X1 1 342 0.21 0.08 0.2 0.2 0.04 0 0.58 0.58 1.27 4.41 0
[1] " "
[1] "Combined beta descriptives"
vars n mean sd median trimmed mad min max range skew kurtosis se
X1 1 344 0 0.02 0 0 0 0 0.29 0.29 10.82 117.19 0
[1] " "
[1] "Average Rhat"
[1] 0.9995165
This model is substantially better than either of the other two. The traceplot suggests that the iterations converge as we would like. However, we still need to massively constrain the beta estimates for it to run, otherwise the starting values drop below zero.
this needs to be investigated…
Used this post to guide this. particularly:
For a beta distribution with shape parameters a and b, the mode is (a-1)/(a+b-2). Suppose we have a desired mode, and we want to determine the corresponding shape parameters. Here’s the solution. First, we express the “certainty” of the estimate in terms of the equivalent prior sample size, k=a+b, with k≥2. The certainty must be at least 2 because it essentially assumes that the prior contains at least one “head” and one “tail,” which is to say that we know each outcome is at least possible. Then a little algebra reveals:
a = mode * (k-2) + 1
b = (1-mode) * (k-2) + 1
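The quoted solution can be written as a small R function and checked by recovering the mode from the resulting shapes. The function name `beta_shapes_mode` and the input values are hypothetical.

```r
# Sketch only: beta shape parameters from a mode and a certainty k = a + b (k >= 2).
beta_shapes_mode <- function(mode, k) {
  stopifnot(k >= 2, mode >= 0, mode <= 1)
  a <- mode * (k - 2) + 1
  b <- (1 - mode) * (k - 2) + 1
  c(a = a, b = b)
}

s <- beta_shapes_mode(mode = 0.75, k = 10)
s                                             # a = 7, b = 3
unname((s["a"] - 1) / (s["a"] + s["b"] - 2))  # recovers the mode, 0.75
```

Because both shapes are at least 1 whenever k is at least 2, this parameterisation cannot produce negative shape parameters.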
For this version we try and estimate the ‘mode’ to be shape 1. KIRSTIN:: explain here
## decide testing rate (min,med,max or off)
testing('skip')
## set up run
stanname='beta_mode_RL.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
save(flare_fit, file=file.path(datadir,'flare_fit_test'))
traceplot(flare_fit,'lp__')
# extract fit data
summary_flare<- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## get some basic output descriptions printed to screen
out_describe(summary_flare)
For this version we assume that V is the mode (above we assumed it serves as the mean) and beta is the certainty aspect (i.e. k)
What this does is basically treat the expected rating (value) as the a parameter for the distribution (scaled by their certainty - beta) and 1 - that value as the b parameter (again, scaled by the uncertainty).
so you have a ratio of their selected value per trial (mode across iterations?) to how far from the highest possible choice they are.
## decide testing rate (min,med,max or off)
testing('med')
## set up run
stanname='beta_mode_RL_2.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
save(flare_fit, file=file.path(datadir,'flare_fit_test'))
traceplot(flare_fit,'lp__')
# extract fit data
summary_flare<- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## get some basic output descriptions printed to screen
out_describe(summary_flare)
This works, but there is not a lot of variance in the alpha parameter when defined by the mode (mean = 0.49, sd = 0.06), compared to when defined by the mean (mean = 0.54, sd = 0.26).
However there is a lot of variation in the beta parameter (mean = -7.21, sd = 134.74)
RL model adding a beta per stimulus to Alex’s model
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_meansd_2beta_RL.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter,warmup = warm_up, chains = chain_n) #add working dir?
hash mismatch so recompiling; make sure Stan code ends with a blank line
SAMPLING FOR MODEL 'beta_meansd_2beta_RL' NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 0.017119 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 171.19 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1:
Chain 1:
Chain 1: WARNING: There aren't enough warmup iterations to fit the
Chain 1: three stages of adaptation as currently configured.
Chain 1: Reducing each adaptation stage to 15%/75%/10% of
Chain 1: the given number of warmup iterations:
Chain 1: init_buffer = 15
Chain 1: adapt_window = 75
Chain 1: term_buffer = 10
Chain 1:
Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
Chain 1: Iteration: 101 / 400 [ 25%] (Sampling)
Chain 1: Iteration: 140 / 400 [ 35%] (Sampling)
Chain 1: Iteration: 180 / 400 [ 45%] (Sampling)
Chain 1: Iteration: 220 / 400 [ 55%] (Sampling)
Chain 1: Iteration: 260 / 400 [ 65%] (Sampling)
Chain 1: Iteration: 300 / 400 [ 75%] (Sampling)
Chain 1: Iteration: 340 / 400 [ 85%] (Sampling)
Chain 1: Iteration: 380 / 400 [ 95%] (Sampling)
Chain 1: Iteration: 400 / 400 [100%] (Sampling)
Chain 1:
Chain 1: Elapsed Time: 103.872 seconds (Warm-up)
Chain 1: 2574.24 seconds (Sampling)
Chain 1: 2678.11 seconds (Total)
Chain 1:
There were 300 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
## get some basic output descriptions printed to screen
out_describe(summary_flare)
RL model adding a beta per stimulus to the model defining the beta shape using the mode instead of the mean. This definitely makes more sense as we assume that they will have different levels of uncertainty about each.
## decide testing rate (min,med,max or off)
testing('off')
## set up run
stanname='beta_mode_2beta_RL_2.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
save(flare_fit, file=file.path(datadir,'flare_fit_test'))
traceplot(flare_fit,'lp__')
# extract fit data
summary_flare <- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## get some basic output descriptions printed to screen
out_describe(summary_flare)
The alpha parameter variance is normal (mean 0.4 and sd 0.12). Beta is much more bounded now though (combined across both stimuli mean 0.79, sd=1.6) over 4000 iterations on 4 chains.
Unhash the series below if you made any changes.
## initialise bash directory and filename
stanname="punish_only.stan"
scriptdir="/Users/kirstin/Dropbox/SGDP/FLARe/FLARe_MASTER/Projects/Hierachal_modelling/Scripts"
## stage
#git add $scriptdir/$stanname
## push
#git push Bayes_modelling